Weight function bootstrap pull request #223

Open: juliettelimozin wants to merge 25 commits into base: main

Conversation

juliettelimozin (Collaborator) commented Mar 17, 2025

This pull request adds the function weight_func_bootstrap, which refits or recalculates IP weights and merges them into an existing expanded data set.
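For readers unfamiliar with the idea, the general pattern of refitting a weight model on a bootstrap resample and merging the new weights into an expanded data set can be sketched in base R as follows. This is only an illustration under stated assumptions, not the implementation in this pull request: the data, the column names (id, period, x, treatment, n_boot, weight_boot) and the single treatment model are all hypothetical, and the package's own classes are not used.

```r
set.seed(1)

# Hypothetical long-format data: one row per id and period
dat <- data.frame(
  id = rep(1:50, each = 4),
  period = rep(0:3, times = 50),
  x = rnorm(200)
)
dat$treatment <- rbinom(200, 1, plogis(0.5 * dat$x))

# Nonparametric bootstrap: draw ids with replacement and count
# how many times each id was drawn
boot_ids <- sample(unique(dat$id), replace = TRUE)
boot_counts <- as.data.frame(table(id = boot_ids), responseName = "n_boot")
boot_counts$id <- as.integer(as.character(boot_counts$id))

# Refit the (hypothetical) treatment model on the resample,
# using the draw counts as frequency weights
boot_dat <- merge(dat, boot_counts, by = "id")
fit <- glm(treatment ~ x, family = binomial(), data = boot_dat, weights = n_boot)

# Inverse probability of the treatment actually received, for every row
p <- predict(fit, newdata = dat, type = "response")
new_weights <- data.frame(
  id = dat$id,
  period = dat$period,
  weight_boot = 1 / ifelse(dat$treatment == 1, p, 1 - p)
)

# Hypothetical expanded data set; attach the recalculated weights by id and period
expanded <- dat[dat$period > 0, c("id", "period")]
expanded <- merge(expanded, new_weights, by = c("id", "period"), all.x = TRUE)
```

Joining on id and period with all.x = TRUE keeps every row of the expanded data, so any row without a matching weight would show up as NA rather than being dropped silently.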

juliettelimozin (Collaborator, Author) commented Mar 17, 2025

Still in draft. The function runs with remodel = TRUE (for the nonparametric bootstrap). I still need to check remodel = FALSE and clean up the code for potential redundancies.

juliettelimozin (Collaborator, Author) commented

@gravesti Hello,
To test this script I'd like to use a small simulated dataset generated with my own code from the simulation study in the CI paper. I would calculate the weights with my original code and compare them to the output of this new function. Is it okay to save this dataset in the data folder so that the test script can load it? Thanks!

gravesti (Contributor) commented

@juliettelimozin Yes that should be fine.

juliettelimozin self-assigned this Mar 19, 2025

github-actions bot (Contributor) commented Mar 19, 2025

Code Coverage Summary

Filename                   Stmts    Miss  Cover    Missing
-----------------------  -------  ------  -------  --------------------------------------------------------------------------------------------------
R/bootstrap.R                442      89  79.86%   143-149, 169-175, 198-200, 213-224, 235-243, 255-263, 366, 418, 518-537, 587-606, 657-659, 674-685
R/calculate_weights.R        132      10  92.42%   29, 57, 85-92
R/data_extension.R           119       1  99.16%   146
R/data_manipulation.R         56       0  100.00%
R/data_preparation.R         163      12  92.64%   143-144, 199-202, 260-261, 271-272, 286-287
R/data_simulation.R           78       6  92.31%   77, 80, 148-152
R/data_utils.R               136      20  85.29%   41-42, 136-144, 156-164
R/expand_trials.R             31       1  96.77%   23
R/generics.R                  65       2  96.92%   108-109
R/initiators.R                48       1  97.92%   136
R/lr_utils.R                  23       0  100.00%
R/modelling.R                116      27  76.72%   68, 77-79, 87, 90-91, 95, 108-110, 118-121, 127-132, 136-139, 188-190
R/predict.R                   96      19  80.21%   64, 113-128, 180-181
R/robust.R                    18       0  100.00%
R/sampling.R                  82       5  93.90%   60-62, 108, 201
R/te_data.R                   40       3  92.50%   106, 113-114
R/te_datastore_csv.R          58       1  98.28%   84
R/te_datastore_duckdb.R       69       5  92.75%   30-57, 77
R/te_datastore.R              18       3  83.33%   34-36
R/te_expansion.R              10       1  90.00%   27
R/te_outcome_model.R          14       0  100.00%
R/te_parsnip.R                26       3  88.46%   78, 91, 100
R/te_stats_glm_logit.R        94      12  87.23%   87, 142-146, 152, 173-177
R/te_weights.R                40      11  72.50%   72, 81, 99-100, 109-110, 122-125, 173
R/trial_sequence.R           296      35  88.18%   326-327, 361, 380, 398-403, 453, 458-459, 462-467, 685-686, 780-784, 795, 821-822, 869, 946-953
R/utils.R                     35       1  97.14%   143
R/weighting.R                202       9  95.54%   175-186
src/code.cpp                  81       6  92.59%   101-106
TOTAL                       2588     283  89.06%

Diff against main

Filename                 Stmts    Miss  Cover
---------------------  -------  ------  -------
R/bootstrap.R             +442     +89  +79.86%
R/calculate_weights.R        0     -16  +12.12%
R/generics.R                +1       0  +0.05%
TOTAL                     +443     +73  -1.14%

Results for commit: ed5c36f

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

Unit Tests Summary

  1 files   19 suites   1m 42s ⏱️
135 tests 113 ✅ 22 💤 0 ❌
477 runs  444 ✅ 33 💤 0 ❌

Results for commit 18f1dc3.

github-actions bot (Contributor) commented Mar 19, 2025

Unit Test Performance Difference

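Test Suite    Status    Time on main    ±Time    ±Tests    ±Skipped    ±Failures    ±Errors
------------  --------  --------------  -------  --------  ----------  -----------  ---------
predict       💚                 47.50    -2.27         0           0            0          0

Additional test case details

Test Suite    Status    Time on main    ±Time  Test Case
------------  --------  --------------  -------  ----------------------------------------------------
predict       💚                 18.62    -1.29  predict.TE_msm_gives_the_same_results_as_new_predict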

Results for commit ed17edd

♻️ This comment has been updated with latest results.

juliettelimozin (Collaborator, Author) commented Mar 24, 2025

@gravesti The weight function now works (it still needs to pass the style checks) and the test script is ready. The tests check that the weights calculated by weight_func_bootstrap match the expected ones and that, when the weight models are refitted, the model coefficients match those obtained by refitting them manually with the existing class structure. I'm not sure what tolerance is appropriate here. I used 1e-7 for now, but for some bootstrap resamples I occasionally get a different coefficient estimate for the cense_d1 model, and I'm not sure why.
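As a point of reference for the tolerance question, here is a minimal base R sketch of how a 1e-7 tolerance behaves when two coefficient vectors are compared with all.equal(). The coefficient values and object names are made up for illustration and do not come from the package or its tests. For values of this magnitude, all.equal() compares a relative difference, not an absolute one, against the tolerance.

```r
# Hypothetical coefficient vectors; the values are illustrative only
coef_bootstrap <- c(`(Intercept)` = -2.3456789, x2 = 0.1234567)
coef_manual <- coef_bootstrap + c(1e-9, -1e-9)
coef_shifted <- coef_bootstrap + c(1e-5, 0)

# all.equal() returns TRUE or a character description of the difference,
# so wrap it in isTRUE() to get a plain logical
within_tol <- isTRUE(all.equal(coef_bootstrap, coef_manual, tolerance = 1e-7))
shifted_within_tol <- isTRUE(all.equal(coef_bootstrap, coef_shifted, tolerance = 1e-7))

# Largest absolute discrepancy, useful when diagnosing which resamples disagree
max_abs_diff <- max(abs(coef_bootstrap - coef_manual))
```

Here within_tol is TRUE and shifted_within_tol is FALSE. Recording max_abs_diff for the resamples that fail would show whether the discrepancy sits just above the tolerance or is a genuinely different fit.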

gravesti (Contributor) commented

@juliettelimozin Sorry for the delay in reviewing. Until now we didn't have dplyr as a dependency. To get this merged sooner, I'll add it. I will get a review done this weekend.

juliettelimozin (Collaborator, Author) commented Mar 28, 2025 via email

gravesti (Contributor) commented Mar 28, 2025 via email

Development

Successfully merging this pull request may close these issues.

Create weight_func_bootstrap() and function for calculating bootstrap CIs for poster presentation
2 participants